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£f) \ Wc present an algorithm that on input of an ?i-vertcx m-edge weighted graph G and a value k, 

produces an incremental sparsifier G with n — 1 + m/k edges, such that the condition number of G with 
G is bounded above by 0(fclog 2 ri) 1 , with probability 1 — p. The algorithm runs in time 
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As a result, we obtain an algorithm that on input of an n x n symmetric diagonally dominant matrix 
A with m non-zero entries and a vector &, computes a vector x satisfying ||x — A + 6|| J 4 < e||A + &|| J 4, in 
C^) ■ expected time 

k*" 0(mlog 2 nlog(l/e)). 

00 ; 

I/"") ■ The solver is based on repeated applications of the incremental sparsifier that produces a chain of graphs 

' which is then used as input to a recursive preconditioned Chebyshev iteration. 

CT) ■ 1 Introduction 

" Fast algorithms for solving linear systems and the related problem of finding a few fundamental eigenvectors 
I ■ is possibly one of the most important problems in algorithm design. It has motivated work on fast matrix 
, multiplication methods, graph separators, and more recently graph sparsifiers. For most applications the 
matrix is sparse, and thus one would like algorithms whose run time is efficient in terms of the number 
of non-zero entries of the matrix. Little is known about how to efficiently solve general sparse systems, 
Ax = b. But substantial progress has been made in the case of symmetric and diagonally dominant (SDD) 
systems, where An > \ in a seminal work, Spielman and Teng showed that SDD systems can be 

solved in nearlydinear time [ST04, EEST05, ST06]. 

Recent research, largely motivated by the Spielman and Teng solver (ST-solver), demonstrates the 
power of SDD solvers as an algorithmic primitive. The ST-solver is the key subroutine of the fastest 
known algorithms for a multitude of problems that include: (i) Computing the first non-trivial (Fiedler) 
eigenvector of the graph, or more generally the first few eigenvectors, with well known applications to 
the sparsest-cut problem [Fie73, ST96, Chu97]; (ii) Generating spectral sparsifiers that also act as cut- 
preserving sparsifiers [SS08]; (iii) Solving linear systems derived from elliptic finite elements discretizations 
of a significant class of partial differential equations [BHV04] . (iv) Generalized lossy flow problems [SD08] ; 
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(v) Generating random spanning trees [KM09] ; and (vi) Several optimization problems in computer vision 
[KMT09, KMST09b] and graphics [MP08, JMD+07]; A more thorough discussion of applications of the 
solver can be found in [SpilO, Ten 10]. 

The ST-solver is an iterative algorithm that produces a sequence of approximate solutions converging 
to the actual solution of the input system Ax = b. The performance of iterative methods is commonly 
measured in terms of the time required to reduce an appropriately defined approximation error by a 
constant factor. Even including recent improvements on some of its components, the time complexity of 
the ST-solver is at least 0(m log 15 n). The large exponent in the logarithm is indicative of the fact that 
the algorithm is quite complicated and lacks practicality. The design of a faster and simpler solver is a 
challenging open question. 

In this paper we present a conceptually simple and possibly practical iterative solver that runs in 
time 0{m log 2 n). Its main ingredient is a new incremental graph sparsification algorithm, which is of 
independent interest. The paper is organized as follows. In Section 2 we review basic notions and we 
introduce notation. In Section 3 we discuss the development of SDD solvers, the algorithmic questions it 
motivates, and the progress on them, with an emphasis on the graph sparsification problem. In Section 
4 we present a high level description of our approach and discuss implications of our solver for the graph 
sparsification problem. The incremental sparsifier is presented and analyzed in Sections 5 and 6. In Section 
7 we explain how it can be used to construct the solver. Finally, in the Appendix we give pseudocode for 
the complete solver. 

2 Preliminaries 

In this Section we briefly recall background facts about Laplacians of weighted graphs. For more details, we 
refer the reader to [RG97] and [BH03]. Throughout the paper, we discuss connected graphs with positive 
edge weights. We use n and m to denote |V| and \E\. 

A symmetric matrix A is positive semi-definite if for any vector x, x T Ax > 0. For such semi-definite 
matrices A, we can also define the A-norm of a vector x by 

II II 2 — T A 

Fix an arbitrary numbering of the vertices and edges of a graph G. Let Wij denote the weight of the 
edge (i,j). The Laplacian Lq of G is the matrix defined by: (i) Lc{i,j) = —Wij, (h) Lo(i,i) = J2i^j w i,j- 
For any vector x, one can check that 

X* LqX = ^ (x u — x v )^w uv . 

u,v£E 

It follows that Lq is positive semi-definite and L^-norm is a valid norm. 

We also define a partial order H on symmetric semi-definite matrices, where A H B if B — A is 
positive semi-definite. This definition is equivalent to x T Ax < x T Bx for all x. We say that a graph H 
k- approximates a graph G if 

Lh -< L G < kL h . 

By the definition of H from above, this relationship is equivalent to x T L^x < x T Lqx < kx t Lhx 
for all vectors x. This implies that the condition number of the pair (Lg,Lh) is upper bounded by k. 
The condition number is an algebraically motivated notion; upper bounds on it are used to predict the 
convergence rate of iterative numerical algorithms. 

3 Prior work on SDD solvers and related graph theoretic problems 

Symmetric diagonally dominant systems are linear-time reducible to linear systems whose matrix is the 
Laplacian of a weighted graph via a construction known as double cover that only doubles the number of 
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non-zero entries in the system [GMZ95, Gre96]. The one-to-one correspondence between graphs and their 
Laplacians allows us to focus on weighted graphs, and interchangeably use the words graph and Laplacian. 

In a ground-breaking approach, Vaidya [Vai91] proposed the use of spectral graph-theoretic properties 
for the design of provably good graph preconditioners, i.e. graphs that -in some sense- approximate the 
input graph, but yet are somehow easier to solve. Many authors built upon the ideas of Vaidya, to 
develop combinatorial preconditioning, an area on the border of numerical linear algebra and spectral 
graph theory [BGH + 05]. The work in the present paper as well as the Spielman and Teng solver is based 
on this approach. It is worth noting that combinatorial preconditioning is only one of the rich connections 
between combinatorics and linear algebra [Chu97, RG97]. 

Vaidya originally proposed the construction of a preconditioner for a given graph, based on a maximum 
weight spanning tree of the graph and its subsequent augmentation with graph edges. This yielded the 
first non-trivial results, an 0((dn) 175 ) time algorithm for maximum degree d graphs, and an 0((dn) 1 ' 2 ) 
algorithm for maximum degree d planar graphs [Jos97]. 

Later, Boman and Hendrickson [BH03] made the crucial observation that the notion of stretch (see 
Section 6 for a definition) is crucial for the construction of a good spanning tree preconditioner; they 
showed that if the non-tree edges have average stretch s over a spanning tree, the spanning tree is an 
0(sm)-approximation of the graph. Armed with this observation and the low-stretch tree of Alon et al. 
[AKPW95], Spielman and Teng [ST03] presented a solver running in time 0(m ). 

The utility of low-stretch trees in SDD solvers motivated further research on the topic. Elkin et al. 
[EEST05] gave an 0(m log 2 n) time algorithm for the computation of spanning trees with total stretch 
0(m log 2 n). More recently, Abraham et. al. presented a nearly tight construction of low-stretch trees 
[ABN08], giving an 0(m log n + n log 2 n) time algorithm that on input a graph G produces a spanning tree 
of total stretch 0(m log n). The algorithm of [EEST05] is a basic component of the ST-solver. While the 
algorithm of [ABN08] didn't improve the ST-solver, it is indispensable to our upper bound. 

The major new notion introduced by Spielman and Teng [ST04] in their nearly-linear time algorithm 
was that of a spectral sparsifier, i.e. a graph with a nearly-linear number of edges that a-approximates a 
given graph for a constant a. Before the introduction of spectral sparsifiers, Benczur and Karger [BK96] 
had presented an 0(m log 3 n) algorithm for the construction of a cut-preserving sparsifier with 0(n log n) 
edges. A good spectral sparsifier is a also a good cut-preserving sparsifier, but the opposite is not necessarily 
true. 

The ST-solver [ST04] consists of a number of major algorithmic components. The base routine is a local 
partitioning algorithm which is the main subroutine of a global nearly-linear time partitioning algorithm. 
The partitioning algorithm is used as a subroutine in the construction of the spectral sparsifier. Finally, the 
spectral sparsifier is combined with the 0(m log 2 n) total stretch spanning trees of [EEST05] to produce a 
(k, 0(k log c n)) ultrasparsifier, i.e. a graph G with n — 1 + (n/k) edges which 0(k log c n)-approximates the 
given graph, for some c > 25. The bottleneck in the complexity of the ST-solver lies in the running time 
of the ultra-sparsification algorithm and the approximation quality of the ultrasparsifier. 

In the special case of planar graphs the ST-solver runs in time 0(n log 2 n). An asymptotically optimal 
linear work algorithm for planar graphs was given in [KM07]. The key observation in [KM07] was that 
despite the fact that planar graphs don't necessarily have spanning trees of average stretch less than 
O(logn), they still have (k,ck\ogk) ultrasparsifiers for a large enough constant c; they can be obtained 
by finding ultrasparsifiers for constant size subgraphs that contain most of the edges of the graph, and 
conceding the rest of the edges in the global ultrasparsifier. In addition, a more practical approach to 
the construction of constant-approximation preconditioners for the case of graphs of bounded average 
degree was given in [KM08]. To this day, the only known improvement for the general case was obtained 
by Andersen et.al [ACL06] who presented a faster and more effective local partitioning routine that can 
replace the partition routine of the spectral sparsifier, improving the complexity of the solver as well. 

Significant progress has been made on the spectral graph sparsification problem. Spielman and Srivas- 
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tava [SS08] showed how to construct a much stronger spectral sparsifier with O(ralogn) edges, by sampling 
edges with probabilities proportional to their effective resistance, if the graph is viewed as an electrical 
network. While the algorithm is conceptually simple and attractive, its fastest known implementation 
still relies on the ST-solver. Leaving the envelope of nearly-linear time algorithms Batson, Spielman and 
Srivastava [BSS09] presented a polynomial time algorithm for the construction of a "twice- Ramanuj an" 
spectral sparsifier with a nearly optimal linear number of edges. Finally, Kolla et al. [KMST09a] gave a 
polynomial time algorithm for the construction of a nearly-optimal (k, 0{k\ogn)) ultrasparsifier. 

4 Our contribution 

In an effort to design a faster sparsification algorithm, we ask: when and why the much simpler faster 
cut-preserving sparsifier of [BK96] fails to work as a spectral sparsifier? Perhaps the essential example is 
that of the cycle and the line graph; while the two graphs have roughly the same cuts, their condition 
number is 0(n). The missing edge has a stretch of 0(n) through the rest of the graph, and thus it has high 
effective resistance; the effective resistance-based algorithm of Spielman and Srivastava would have kept 
this edge. It is then natural to try to design a sparsification algorithm that avoids precisely to generate a 
graph whose "missing" edges have a high stretch over the rest of the original graph. 

This line of reasoning leads us to a conceptually simple sparsification algorithm: Find a low-stretch 
spanning tree with a total stretch of O(mlogn). Scale it up by a factor of k so the total stretch is 
0(m log n/k) and add the scaled up version to the sparsifier. Then over-sample the rest of the edges 
with probability proportional to their stretch over the scaled up tree, taking 0(m log 2 n/k) samples. In 
Sections 5 and 6 we analyze a slight variation of this idea and we show that while it doesn't produce an 
ultrasparsifier, it produces what we call an incremental sparsifier which is a graph with n — 1 + m/k edges 
that 0(k log 2 n)-approximates the given graph 2 . Our proof relies on the machinery developed by Spielman 
and Srivastava [SS08]. 

As we explain in Section 7 the incremental sparsifier is all we need to design a solver that runs in the 
claimed time. Precisely, we prove the following. 

Theorem 4.1 On input annxn symmetric diagonally dominant matrix A with m non-zero entries and a 
vectorb, a vector x satisfying \\x— A + o\\a < e\\A + b\\A, can be computed in expected time 0(m log 2 nlog(l/e)) 

4.1 Implications for the graph sparsification problem 

The only known nearly-linear time algorithm that produces a spectral sparsifier with 0(n log n) edges is 
due to Spielman and Srivastava [SS08] and it is based on O(logn) calls to a SDD linear system solver. Our 
solver brings the running time of the Spielman and Srivastava algorithm to 0{m log 3 n). It is interesting 
that this algebraic approach matches up to log log n factors the running time bound of the purely combi- 
natorial algorithm of Benczur and Karger [BK96] for the computation of the (much weaker) cut-preserving 
sparsifier. We note however that an 0(m + nlog 4 n) time cut-preserving sparsification algorithm was 
recently announced informally [HP10]. 

Sparsifying once with the Spielman and Srivastava algorithm and then applying our incremental spar- 
sifier gives a (k, 0(k log 3 n)) ultrasparsifier that runs in 0(m log 3 n) randomized time. Within the envelope 
of nearly-linear time algorithms, this becomes the best known ultrasparsification algorithm in terms of 
both its quality and its running time. Our guarantee on the quality of the ultrasparsifier is off by a factor 
of 0(log 2 n) comparing to the ultrasparsifier presented in [KMST09a]. In the special case where the input 
graph has 0(n) edges, our incremental sparsifier is a (k, 0{k log 2 n)) ultrasparsifier. 

2 In the latest version of their paper [ST06], Spielman and Teng also construct and use an incremental sparsifier, but they 
still use the term ultrasparsifier for it. 



4 



5 Sparsification by Oversampling 

In this section we revisit a sampling scheme proposed by Spielman and Srivastava for sparsifying a graph 
[SS08]. Consider the following general sampling scheme: 



Sample 

Input: Graph G = (V, E, w), p' : E -> R+, real £. 
Output: Graph G' = (V, E', w'). 

q := C s t log t log(l/£) (* Cs is a known constant *) 
Pe ■= Pe/t 

G' := (V,E',w') with E' = 
for q times do 

Sample one e £ E with probability of picking e being p e . 
Add e to E' with weight = w e /p e 
end for 

For all e G let w e > := w e /q 
return G 1 



Spielman and Srivastava pick p' e = w e R e where R e is the effective resistance of e in G, if G is viewed as 
an electrical network with resistances l/w e . This choice returns a spectral sparsifier. A key to bounding 
the number of required samples is the identity ^2 e p' e = n — 1. Calculating good approximations to the 
effective resistances seems to be at least as hard as solving a system, but as we will see in Section 6, it 
is easier to compute numbers p' e > (w e R e ), while still controlling the size of t = J2ePe- The following 
Theorem considers a sampling scheme based on p'^s with this property. 

Theorem 5.1 (Oversampling) Let G = (V,E,w) be a graph. Assuming that p' e > w e R e for each edge 
e £ E, and £ € 0,(1 /n), the graph G' = Sample(G,p', £) satisfies 

G<2G' < 3G 

with probability at least 1 — £. 

The proof follows closely that Spielman and Srivastava [SS08], with only a minor difference in one 
calculation. Let us first review some necessary lemmas. 

If we assign arbitrary orientations on the edges, then we can define the incidence matrix T G sjj mxn as 
follows: 

{—1 if u is the head of e 
1 if u is the tail of e 
otherwise 

If we let W be the diagonal matrix containing edge weights, then W 1 ^ 2 is a real positive diagonal matrix 
as well since all edge weights are positive. The Laplacian L can be written in terms of W and T as 

L = T T WT = T T W 1/2 W 1/2 T. 



Algorithm Sample forms a new graph by multiplying each edge e by a nonnegative number s e . If S is 
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the diagonal matrix with S(e,e) = s e , the Laplacian of the new graph can be seen to be equal to 

L = T T WT = T T W 1/2 SW 1/2 T. 

Let L + denote the Moore-Penrose of L, i.e. the unique matrix sharing with L its null space, and acting 
as the inverse of L in its range. The key to the proofs of [SS08] is the m x m matrix 

n = W 1/2 TL + T T W 1/2 , 

for which the following lemmas are proved. 

Lemma 5.2 (Lemma 3i in [SS08]) U is a projection matrix, i.e. H 2 = II. 
Lemma 5.3 (Lemma 4 in [SS08]) 

(i - ||nn - n5n|| 2 )L ■< i < (i + ||nn - usu\\ 2 )l. 

We also use Lemma 5.4 below, which is Theorem 3.1 from Rudelson and Vershynin [RV07]. The first 
part of the Lemma was also used as Lemma 5 in [SS08] in a similar way. 

Lemma 5.4 Letp be a probability distribution overVt C R d such thatsupy^ \\y\\2 < M and \\M p (yy T )\\2 < 
1. Let y\ ... y q be independent samples drawn from p, and let 



a = CM\ 



'log 9 



Then: 
1. 



1 q 

E||- J^rf -E(yy T )\\ 2 < a. 



i=l 



Pr[\\- q Y.y^ -nyy T )\\2 >x]< 2e- cx2 ' a \ 



i=l 

Here C and c are fixed constants. 

Proof (of Theorem 5.1) Following the pseudocode of Sample, let t = YlePe an< ^ Q = Cs*k)gilog(l/£). 
It can be seen that 



USU = -^2y iy , 



Q 

T 

i ' 



where the yi are drawn from the distribution 



y = — — !!(■■ e) with probability p e . 

For the distribution y we have E(yy T ) = Iffl = II. Since II is a projection matrix, we have 1 1 IX 1 1 2 < 1- 
So, the condition imposed by Lemma 5.4 on the distribution holds for y. The fact that II is a projection 
matrix also gives 

n( : , e ) T n(:, e ) = (nn)( e ,e)=n(e,e), 
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which we use to bound M as follows. 



M = sup — — ||II(:, e) 1 1 2 = sup — — \/lI(e, e) = sup J -^j\Jw e R e < Vt. (5.1) 

e \/Pe e yjPe e \ Pe 

The last inequality follows from the assumption about the p' e . Recall now that we have log(l/£) < logn 
by assumption, t > ^2 e w e R e by construction, and ^2 e w e R e = n — 1 by Lemma 3 in [SS08]. Combining 
these facts and setting q = est log t log(l/f;) for a proper constant cs, part 1 of Lemma 5.4 gives 



a < 



, clog(2/0 

Now substituting x = \ into part 2 of Lemma 5.4, we get 

*M||-X> y f - E(yy T )\\ 2 > 1/2] < 2e~^ 2 < 2e^l^m) < 



* i=i 

It follows that with probability at least 1 — £ we have 
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which implies ||nSTI — nn| 1 2 < 1/2. The theorem then follows by Lemma 5.3. ■ 
Note. The upper bound for M in inequality 5.1 is in fact the only place where our proof differs from 
that of [SS08]. In their case the last inequality is replaced by an exact inequality, which is possible because 
the exact values for w e R e are used. In our case, by using inexact values we get a weaker upper bound 
which reflects in the density (depending on m, not n) of the incremental sparsifier. It is however enough 
for the solver. 

6 Incremental Sparsifier 

Consider a spanning tree T of G = (V,E,w). Let w'(e) = l/w(e). If the unique path connecting the 
endpoints of e consists of edges e± . . . e^, the stretch of e by T is defined to be 

stretch T (e) = E ^= 1 > ^ ) . 

w'{e) 

Let R e denote the effective resistance of e in G and RT e denote the effective resistance of e in T. We 
have RT e = Yli=i ^/ w i e i)- Thus stretchxie) = w e RT e . By Rayleigh's monotonicity law [DSOO], we have 
RT e > R e , so stretchT(e) > w e R e . As the numbers stretchT(e) satisfy the condition of Theorem 5.1, we 
can use them for oversampling. But at the same time we want to control the total stretch, as it will directly 
affect the total number of samples required in SAMPLE. This leads to taking T to be a low-stretch tree, 
with the guarantees provided by the following result of Abraham, Bartal, and Neiman [ABN08]. 

Theorem 6.1 (Corollary 6 in [ABN08]) Given a graph G = (V,E,w'), LowStretchTree(G) in time 
0(mlog n + nlog 2 n), outputs a spanning tree T of G satisfying X^ee-E = 0(mlogn ■ log log n 3 ). 

Our key idea is to scale up the low-stretch tree by a factor of k, incurring a condition number of k but 
allowing us to sample the non-tree edges aggressively using the upper bounds on their effective resistances 
given by the tree. The details are given in algorithm IncrementalSparsify. 
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IncrementalSparsify 

Input: Graph G, reals k, < £ < 1 

Output: Graph H 

1: T := LowStretchTree(G) 

2: Let T' be T scaled by a factor of k 

3: Let G' be the graph obtained from G 

by replacing T by T" 
4: for e£E do 
5: Calculate stretchy (e) 
6: end for 

7: # := Sample(G", stretchy, l/2£) 
8: return 2ff 



Theorem 6.2 Given a graph G with n vertices, m edges and any values k < m, £ G Jl(l/re), Incremen- 
talSparsify computes a graph H such that: 



• G< H ^ 3kG 

2 



• H has n — 1 + 0{{m/K) log nlog(l/£)) edges, 
with probability at least 1 — £. The algorithm runs in 0(m log n + (re log 2 re + m log 3 n/n) log(l/^)) time. 

Proof We first bound the condition number. Since the weight of an edge is increased by at most a factor 
of k, we have G H G' ^ kG. Furthermore, the effective resistance along the tree of each non-tree edge 
decreases by a factor of k. Thus IncrementalSparsify sets p' e = 1 if e G T and stretchx{e) / k otherwise, 
and invokes Sample to compute a graph i7 such that with probability at least 1 — £, we get 

G <G' < H < 3G" ^ 3kG. 

We next bound the number of non-tree edges. Let t' = ^2 e ^ T stretch?' (e), so t' = 0({m/n) log n). 
Then for the number t used in Sample we have t = t' + n — 1 and g = C s t logt log(l/£) is the number 
of edges sampled in the call of Sample. Let be a random variable which is 1 if the i th edge picked by 
Sample is a non-tree edge and otherwise. The total number of non-tree edges sampled is the random 
variable X = Yli=i -^i, and its expected value can be calculated using the fact Pr{Xi = 1) = t'/t: 

E[X] = ^ = *' Cstl ° g ^ g(1/0 = 0((m/n) log 2 nlog(l/fl)- 
A standard form of Chernoff's inequality is: 



Pr[X > (1 + 5)E[X}} < 



e 



Letting 5 = 2, and using the assumption k < m, we get Pr(X > 3£/[X]) < (e 2 /27) E ^ < l/n c , for any 
constant c. Hence, the probability that IncrementalSparsify succeeds, with respect to both the number 
of non-tree edges and the condition number, is at least 1 — £. 

We now consider the time complexity. We first generate a low-stretch spanning tree in 0{m log re + 
n log 2 re) time. We then compute the effective resistance of each non-tree edge by the tree. This can be done 
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using Tarjan's off-line LCA algorithm [Tar 79], which takes 0(m) time [GT83]. We next call SAMPLE 
with parameters that make it draw 0{{n + m/ Klogn) lognlog(l/£)) samples (precisely, 0{t logt log(l/£)) 
samples where t = 0(n + raj k log n)). To compute each sample efficiently, we assign each edge an interval 
on the unit interval [0, 1] with length corresponding to its probability, so that no two intervals overlap. At 
each sampling iteration we pick a random value in [0, 1] and do a binary search in order to find the interval 
that contains it in 0(log n) time. Thus the cost of a call to SAMPLE is 0((n log 2 n+m/ k, log 3 n) log(l/£)). 
■ 

7 Solving using Incremental Sparsifiers 

The solver of Spielman and Teng [ST06] consists of two phases. The preconditioning phase builds a chain 
of progressively smaller graphs C = {A\, B±, A2, ■ ■ ■ , A^} starting with A\ = A. The process for building 
C alternates between calls to a sparsification routine UltraSparsify which constructs B\ from Ai and 
a routine GreedyElimination (following below) which constructs Ai + i from Bi. The preconditioning 
phase is independent from the 6-side of the system L^x = b. 



GreedyElimination 

Input: Weighted graph G = (V, E, w) 

Output: Weighted graph G = (V, E, w) 

G:=G 
repeat 

greedily remove all degree- 1 nodes from G 
if deg^v) = 2 and (v, u±), (v, 112) £ Eg then 
w' := w(ui,v)w(u2,v)/ (w(ui,v) + w{u2,v)) 
replace (u\,v,U2) by an edge of weight w' in G 
end if 

until there are no nodes of degree 1 or 2 in G 
return G 



The solve phase passes C, b and a number of iterations t (depending on a desired error e) to the 
recursive preconditioning algorithm R-P-Chebyshev, described in Section 9. The time complexity of the 
solve phase depends on e, but more crucially on the quality of C, which is a function of the sparsifier quality. 

Definition 7.1 (re(re)-good chain) Let k{u) be a monotonically non- decreasing function of n. Let C = 
{A = A\, B\, A2, ■ ■ ■ ,Ad} be a chain of graphs, and denote by rii and rrii the numbers of nodes and edges 
of Ai respectively. We say that C is k(h)- good for A, if: 

1. Ai<Bi< K( ni )Ai. 

2. Ai+\ = GreedyElimination^). 

3. m.i/mi + i > c T \f K{ni), for some constant c r . 

Spielman and Teng analyzed a recursive preconditioned Chebyshev iteration and showed that a k{u)- 
good chain for A can be used to solve a system on La- This is captured by the following Lemma, adapted 
from Theorem 5.5 in [ST06]. 

Lemma 7.2 Given a K,{n)-good chain for A, a vector x such that \ \x — L~^b\\A < e||Lj[&m can be computed 
in 0{m? d mi^/ K{n{) log(l/e)) expected time. 
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For our solver, we follow the approach of Spielman and Teng. The main difference is that we replace 
their routine UltraSparsify with our routine IncrementalSparsify, which is not only faster but also 
constructs a better chain which translates into a faster solve phase. We are now ready to state our algorithm 
for building the chain. In what follows we write v := 0{g{riij) to mean l v := /(nj) for some explicitly 
known function f(n) £ 0(g(n))'. 



BuildChain 

Input: Graph A, scalar p with < p < 1 

Output: Chain of graphs C = {A = A\, B\,A2, . . . , AO 



1 


At := A 


2 


C := 


•> 
o 


wt In 1 1 Tii ■ {]r\<y loo" tj^^/^ f 1 r\ 

W 1111C / / 1*1 I lUu \\J^L 1 L 1 UU 


4 


if mi > logn then 


5 


^ := log n 


6 


else 


7 


£ := log logn 


<S 


end if 


9 


K := 0(log 4 nilog(l/p)) 


10 


Bi := IncrementalSparsify(A, K,p/(2£)) 


11 


A i+ i := GREEDYELIMINATION(i?i) 


12 


if TOj/mj + i < c r ^/3K then 


13 


return FAILURE 


14 


end if 


lo 


C = CU{Ai,Bi} 


16 


i:=i + l 


17 


end while 


18 


return C 



Lemma 7.3 Given a graph A, BuildChain(A,p) produces an 0(log 4 n)-good chain for A, with probability 
at least 1 — p. The algorithm runs in time 

0((m logn + re log 2 re) log(l/p)). 

Proof Assume that Bi has n« — 1 + rrii/k' edges. A key property of Greed yElimination is that if G 
is a graph with n — 1 + j edges, GreedyElimination(G) has at most 2j — 2 vertices and 3j — 3 edges 
[ST06]. Hence GreedyElimination^) has at most 3rrii/k' edges. It follows that m,/m i+ i > k'/3. 
Then, in order to satisfy the second requirement, we must have Ai ■< Bi ■< c'k' 2 Ai, for some sufficiently 
small constant c'. 

However, we also know that the call to IncrementalSparsify returns an incremental sparsifier Bi 
that 3K-approximates Aj. So it is necessary that c'k' 2 > 3k. Moreover, Bi has nj — 1 + 0(m,j log 2 n/n) 
edges, a number which we assumed is equal to ni — 1 + m,i/k'. The value assigned to k by the algorithm 
is taken to be the minimum that satisfies these two conditions. 

The probability that Bi has the above properties is by construction at least 1 — p/(21ogn) if > logn 
and 1 — p/{2 log logn) otherwise. The probability that the requirements hold for all i is then at least 

( 1 - pj (2 log n) ) log n ( 1 - pj (2 log log n) ) log log n 

> (l-p/2) 2 >l-p. 
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Finally note that each call to IncrementalSparsify takes 0((mj log 2 n) log(l/p)) time. Since mi 
decreases faster than geometrically with i, the claim about the running time follows. ■ 
Combining Lemmas 7.2 and 7.3 proves our main Theorem. 

Theorem 7.4 On input annxn symmetric diagonally dominant matrix A with m non-zero entries and a 
vectorb, a vector x satisfying \\x— A + b\\A < e\\A + b\\A, can be computed in expected time 0(m log 2 nlog(l/e)). 



8 Comments / Extensions 

Unraveling the analysis of our bound for the condition number of the incremental sparsifier, it can been 
that one logn factor is due to the number of samples required by the Rudelson and Vershynin theorem. 
The second log n factor is due to the average stretch of the low-stretch tree. 

It is quite possible that the low-stretch construction and perhaps the associated lower bound can be 
bypassed -at least for some graphs- by a simpler approach similar to that of [KM07] . Consider for example 
the case of unweighted graphs. With a simple ball-growing procedure we can concede in our incremental 
sparsifier a 1/logra fraction of the edges, while keeping within clusters of diameters 0(log 2 n) the rest of 
the edges. The design of low-stretch trees may be simplified within the small diameter clusters. This 
diameter-restricted local sparsification is a natural idea to pursue, at least in an actual implementation of 
the algorithm. 
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9 Appendix: The Complete Solver 

The purpose of this section is to provide a few more algebraic details about the chain of preconditioners, 
and the recursive preconditioned Chebyshev method which consists the solve phase of the solver. The 
material is not new and we include it only for completeness. We focus on pseudocode. We refer the reader 
to [ST06] for a more detailed exposition along with proofs. 

Direct methods - Cholesky factorization. If A is a symmetric and positive definite (SPD) matrix, 
it can be written in the form A = LL T , a product known as the Cholesky factorization of A. This extends 
to Laplacians, with some care for the null space. The Cholesky factorization can be computed via a 
symmetric version of Gaussian elimination. Given the decomposition, solving the systems Ly = b and 
L T x = y yields the solution to the system Ax = b; the key here is that solving with L and L T can be 
done easily via forward and back substitution. A partial Cholesky factorization with respect to the first k 
variables of A, puts it into the form 



where Ik denotes the k x k identity matrix, and A^ is known as the Schur complement of A with respect 
to the elimination of the k first variables. The matrix A^+i is the Schur complement of A^ with respect 
the the elimination of its first variable. 

Given a matrix A, the graph Ga of A is defined by identifying the vertices of G a with the rows and 
columns of A and letting the edges of Ga encode the non-zero structure of A in the obvious way. 

It is instructive to take a graph-theoretic look at the partial Cholesky factorization when k = 1. In 
this case, the graph Ga 1 contains a clique on the neighbors of the first node in Ga- In addition, the 
first column of L is non-zero on the corresponding coordinates. This problem is known as fill. It then 
becomes obvious that the complexity of computing the Cholesky factorization depends crucially on the 
ordering of A. Roughly speaking, a good ordering has the property that the degrees of the top nodes of 
A, A±, A2, ■ ■ ■ , Ak are as small as possible. The best known algorithm for positive definite systems of planar 
structure runs in time 0(n L5 ) and it is based on the computation of good orderings via nested dissection 
[Geo73, LRT79, AY]. 

There are two fairly simple but important facts considering the partial Cholesky factorization of equality 
9.2 [ST06]. First, if the top nodes of A,A\,... , Ak-\ have degrees 1 or 2, then back-substitution with L 
requires only O(n) time. Second, if A is a Laplacian, then A^ is a Laplacian. Such an ordering and the 
corresponding Laplacian Ak can be found in linear time via GreedyElimination, described in Section 7. 
The corresponding factor L can also be computed easily. 

Iterative methods. Unless the system matrix is very special, direct methods do not yield nearly-linear 
time algorithms. For example, the nested dissection algorithm is known to be asymptotically optimal for 
the class of planar SPD systems, within the envelope of direct methods. Iterative methods work around 
the fill problem by producing a sequence of approximate solutions using only matrix- vector multiplications 
and simple vector- vector operations. For example Richardson's iteration generates an approximate solution 
Xi + \ from X{, by letting 



The solver in this paper, as well as the Spielman and Teng solver [ST06], are based on the very 
well studied Chebyshev iteration [Axe94]. The preconditioned Chebyshev iteration (P-Chebyshev) is the 
Chebyshev iteration applied to the system B + Ax = B + b, where A, B are SPD matrices, and B is known 
as the preconditioner. The preconditioner B needs not be explicitly known. The iteration requires matrix- 
vector products with A and B + . A product of the form B +1 z is equivalent to solving the system By = c. 
Therefore (P-Chebyshev) requires access to only a function /#(c) returning B +1 c. In addition it requires 
a lower bound A m i n on the minimum eigenvalue of (^4, B) and an upper bound A max on the maximum 




(9.2) 



x i+ i = (I - A)xi + b. 
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generalized eigenvalue of (A,B). 



P-Chebyshev 

Input: SPD matrix A, vector b, number of iterations t, 

preconditioner fsiz), A min , A max 

Output: approximate solution x for Ax = b 

x : = 
r := b 

d • — (^rnax ~i~ ^min) /2 
C • — (^max A m j rl )/2 

for i = 1 to t do 

z := /bW 
if i = 1 then 

a; := 2 

q := 2/d 
else 

P := (ca/2) 2 

a := \/{d-P) 

x := z + j3x 
end if 
x := x + ax 
r := b — Ax 
end for 
return x 



A well known fact about the Chebyshev method is that after 0(-\/A max /A m in log 1/e) iterations the 
return vector x satisfies \\x — A + b\\ A < e || J 4 + 6|| j4 [Axe94]. 

Hybrid methods. One of the key ideas in Vaidya's approach was to combine direct and iterative 
methods into a hybrid method by exploiting properties of Laplacians. [Vai91]. For the rest of this section 
we will identify graphs and their Laplacians, using their natural 1-1 correspondence. 

Let Ai be a Laplacian. The incremental sparsifier B\ of A\ is a natural choice as preconditioner. With 
proper input parameters, IncrementalSparsify returns a B\ that contains enough degree 1 and 2 nodes, 
so that GreedyElimination can make enough progress reducing B\ to a matrix of the form 

Bl = Ll ( A 2 ) L ^ 

where A 2 is the output of algorithm GreedyElimination. Let Ij denote the identity of dimension j and 

III = ( Idim(A 2 ) ) 

Ql = ( Idim{A 1 )~dim{A 2 ) ) ■ 

Recall that P-Chebyshev requires the solution of By = c, which is given by 

v - L -t( QiLi'c \ 

The two matrix- vector products with L^ 1 ,L^ T can be computed in time 0{n) via forward and back 
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substitution. Therefore, we can solve a system in B by solving a linear system in A2 and performing 
0(n) additional work. Naturally, in order to solve systems on A2 we can recursively apply preconditioned 
Chebyshev iterations on it, with a new preconditioner B2. This defines a preconditioning chain C that 
consists of progressively smaller graphs A = A±, B±, A%, B2, . . . , Ad, along with the corresponding matrices 
Li,Ui,Qi for 1 < i < d— 1. So, to be more precise than in Section 7, routine BuildChain has the following 
specifications. 



BuildChain 

Input: Graph A, scalar p with < p < 1 
Output: Chain C = {{A i} B u Li, LTj, Qi}fzl,A d } 



We are now ready to give pseudocode for the recursive preconditioned Chebyshev iteration. 



R-P-Chebyshev 

Input: Chain C, level i, vector 6, number of iterations t 
Output: Approximate solution x for AiX = b 

1: if i = d for some fixed d then 
2: return Afb 
3: else 

4: k := n(Ai,Bi) 
5: Define function fi(z): 
6: t' := fl.33v^l 

7: z' := L^z 

8: '.= Q{Z 

9: z'2 := R-P-Chebyshev(C, i + 1, nV, t') 

10: fa) <- L^V Z^'] T 

11: /:=l-2e" 2 
12: n:=(l + 2e" 2 )K 

13: X :=P-CHEBYSHEV(Aj, 6, t, fi(z),l, u) 
14: return x 
15: end if 



The complete solver. Finally the pseudocode for the complete solver is as follows. 



Solve 

Input: Laplacian La, vector b, error e, failure probability p 
Output: Approximate solution x 

C := BuildChain^, p) 

x := R-P-CHEBYSHEv(C,l,fr,0(log 2 nlog(l/e)) 
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